Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable Kopia Maintenance Interval #8581

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kaovilai
Copy link
Member

@kaovilai kaovilai commented Jan 6, 2025

Signed-off-by: Tiger Kaovilai [email protected]

Thank you for contributing to Velero!

Please add a summary of your change

Configurable Kopia Maintenance Interval. repo-maintenance-job-configmap adds an option for fullMaintenanceInterval where fastGC (12 hours), and eagerGC (6 hours).
This configmap is loaded in assembleRepoParam that returns RepoParam which now contains the maintenance interval referenced by several repository functions.
The manager functions then call provider functions which read repoParam.FullMaintenanceInterval and add it to repoOption.
Repooption is then passed to repoService, in this case kopiaRepoService.Init
kopiaRepoService.Init calls writeInitParameters which finally set kopia p.FullCycle.Interval

Does your change fix a particular issue?

Fixes #8364

Please indicate you've done the following:

@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch 2 times, most recently from 6e3d09f to 569c2ef Compare January 15, 2025 10:03
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 7.69231% with 12 lines in your changes missing coverage. Please review.

Project coverage is 59.16%. Comparing base (3eaa739) to head (334ac16).
Report is 47 commits behind head on main.

Files with missing lines Patch % Lines
pkg/repository/udmrepo/kopialib/lib_repo.go 7.69% 9 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8581      +/-   ##
==========================================
- Coverage   59.18%   59.16%   -0.02%     
==========================================
  Files         370      370              
  Lines       39595    39825     +230     
==========================================
+ Hits        23434    23564     +130     
- Misses      14679    14757      +78     
- Partials     1482     1504      +22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch from 569c2ef to bcc6e39 Compare January 15, 2025 13:13
@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch 3 times, most recently from 014d288 to 40af53f Compare January 15, 2025 13:54
@@ -130,8 +130,23 @@ velero install --default-repo-maintain-frequency <DURATION>
```
For Kopia the default maintenance frequency is 1 hour, and Restic is 7 * 24 hours.

### Full Maintenance Interval customization
The full maintenance interval defaults to kopia defaults of 24 hours. Velero provide two override options under `fullMaintenanceInterval` configuration.
- fastGC: 12 hours
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it makes sense to have defaultGC ? I know it's when there are no configuration added, but for usability maybe it makes sense to allow user ensure default is used.

on the second end of the intervals there may be a need to push default to something longer, for the datasets with minimal changes to save on compute costs ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have default as an explicit option -- it will make changes from eager/fast back to default consistent with the other changes, and it also keeps Velero consistent with 24 hour default even if Kopia changes their default i the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So add default and also infrequentGC ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that we need infrequentGC now -- but we could add it in the future if we get a customer complaining that they don't want to remove deleted blobs within 48 hours.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the name of the default option, I suggest we use normalGC. infrequentGC sounds more like a value below normal.

BackupRepo *velerov1api.BackupRepository
BackupLocation *velerov1api.BackupStorageLocation
BackupRepo *velerov1api.BackupRepository
FullMaintenanceInterval udmrepo.FullMaintenanceIntervalOptions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be part of the BackupRepositorySpec, which is part of BackupRepository similarly to:

MaintenanceFrequency metav1.Duration `json:"maintenanceFrequency"`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User do not author BackupRepositorySpec but yes we can certainly allow user write stuff there after..

Do we want interval to be set from configmap still? or just spec? or both where backupRepository is what is used, but configmap is used as initial value only for any new kopia repos?

ie. if not set use default, but upon huge kopia dir, user can edit backupRepository and velero will update repository maintenance internval parameters?

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BackupRepository CR is a low level API and should not be manipulated by users directly unless really necessary.
Therefore, we should always try to add the option to a configmap.

But instead of repo-maintenance-job-configmap, I suggest we add the option to backup-repository-configmap. The later maps the value to RepositoryConfig of BackupRepository CR, so you will not need to pass any new parameter to repo provider or unified repo.

Logically, this is also reasonable. repo-maintenance-job-configmap is to control the behavior of repo maintenance job but in fact the run of a repo maintenance job doesn't always result in a maintenance in the repo --- the job may start and quit without doing anything if the repo thinks none of the maintenance operations are due. backup-repository-configmap is to control the behavior of unified repo, and when to do a maintenance is an internal behavior of unified repo.

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, MaintenanceFrequency of BackupRepository CR is set by Velero according to the --default-repo-maintain-frequency server option, unified repo interface DefaultMaintenanceFrequency method or defaultMaintainFrequency global value, so it should not be edited by users directly unless really necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware backup-repository-configmap was a thing.. will check. Thanks.

pkg/repository/udmrepo/kopialib/lib_repo.go Outdated Show resolved Hide resolved
@@ -130,8 +130,23 @@ velero install --default-repo-maintain-frequency <DURATION>
```
For Kopia the default maintenance frequency is 1 hour, and Restic is 7 * 24 hours.

### Full Maintenance Interval customization
The full maintenance interval defaults to kopia defaults of 24 hours. Velero provide two override options under `fullMaintenanceInterval` configuration.
- fastGC: 12 hours
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have default as an explicit option -- it will make changes from eager/fast back to default consistent with the other changes, and it also keeps Velero consistent with 24 hour default even if Kopia changes their default i the future.

@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch from 40af53f to 14200ad Compare January 15, 2025 15:13
@@ -348,9 +348,21 @@ func (m *manager) assembleRepoParam(repo *velerov1api.BackupRepository) (provide
if err := m.client.Get(context.Background(), client.ObjectKey{Namespace: m.namespace, Name: repo.Spec.BackupStorageLocation}, bsl); err != nil {
return provider.RepoParam{}, errors.WithStack(err)
}
jobConfig, err := repository.GetMaintenanceJobConfig(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the comment https://github.com/vmware-tanzu/velero/pull/8581/files#r1917579231, let's use backup-repository-configmap, if so, this code is not required.

@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch from 14200ad to cc27593 Compare January 16, 2025 10:28
@kaovilai kaovilai force-pushed the configKopiaMaintInterval branch from cc27593 to 334ac16 Compare January 16, 2025 10:29
Comment on lines +141 to +154
apiVersion: v1
kind: ConfigMap
metadata:
name: <config-name>
namespace: velero
data:
<repository-type-1>: |
{
"fullMaintenanceInterval": fastGC
}
<repository-type-2>: |
{
"fullMaintenanceInterval": normalGC
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pulled this from design, since velero.io docs lack this example still.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configurable Kopia Maintenance Interval
4 participants